Speaker: Jeremy Meyer


Jeremy Meyer heads up the Professional Services and 
Education team at GridGain Systems, the creators 
and sponsors of the Apache Ignite project. 


He is a computer scientist, philosopher, coder, 
hobbyist, very rare blogger and writer as well as a 
lover of good design and good puzzles 


Bad at solving the Rubik’s cube 



Introduction 


Record for solving the 3x3x3 Rubik's cube: never

I love the engineering and design of oddly shaped cubes, so I collect them. Permutations of these cubes are generally simpler than the original 3x3x3.

I wrote some naïve, recursive algorithms to help solve certain moves (like non-destructive corner swaps).


Why Groovy with Ignite?


Paul King (who you have just heard) gave me the idea with his Whisky presentation at the GridGain Apache Ignite Summit


Ignite's compute grid is quite Java-focused, and Groovy is very compatible with Java


"What if we used the dynamic, easy to code and prototype aspect of Groovy..


..with the fantastically scalable compute power of Apache Ignite's compute grid, and clever peer class loading?" *


I will share my journey with you..




*(and fixed the problem of embarrassingly unsolved cubes on my coffee table?) 

What is Apache Ignite?


Apache Ignite is a distributed database management system for high-performance computing and can be 
used to power in-memory apps, as a cache, or an in-memory database, or datagrid sitting between an 
application and third-party databases. 


Apache Ignite can help to: 


* Build real-time and event-driven solutions that process data with in-memory speed
* Scale up and out across available memory and disk capacity
* Take advantage of built-in SQL, high-performance computing and real-time processing APIs


A little more about Ignite


Ignite automatically distributes data across 
partitions and nodes. Compute tasks can be distributed, too. 


Collocation of computations and in-memory data is Ignite’s secret 
sauce. 


A thick client connects to the cluster as its own node and can run tasks; thin clients can connect via JDBC and other, more lightweight protocols


Tasks can be Java closures, or Runnable classes


Key Concept: Class files can be copied to server nodes, or serialized automatically via "peer class loading"


Some numbers to start..


Don't worry: in-depth combinatorial theory or Rubik's cube math/efficiency are not needed here


World record for a human solving the entire regular Rubik's cube: 3.8 secs.
World record for a robot solving the regular Rubik's cube: 0.38 secs


The 3x3x2 has 12 moves, so a search 10 moves deep covers 12^10, about 62 billion, move sequences (with no pruning)


My code in Python for swapping two final corners of a 3x3x2 without messing up the rest of the cube .... 27 hours


My code in Groovy for swapping two edge pieces (simpler) without messing up the top layers .... 5 minutes


This is what I made the focus of the study
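
The 62 billion figure is simply 12 raised to the 10th power. The sketch below reproduces it and, as an illustration only, shows how much smaller the count becomes under one assumed pruning rule: never play the single move that directly undoes the previous one. The class and method names are hypothetical, and the pruning rule is an assumption rather than the rule used in the talk's code.

```java
class SearchSpaceSketch {
    // Number of move sequences of exactly `depth` moves when all `moves` are tried at every step.
    static long sequences(int moves, int depth) {
        long total = 1;
        for (int i = 0; i < depth; i++) {
            total = Math.multiplyExact(total, (long) moves);
        }
        return total;
    }

    // Same count, assuming each move has exactly one "undo" move that is skipped
    // after the first step (an illustrative pruning rule, not the talk's).
    static long sequencesWithoutImmediateUndo(int moves, int depth) {
        if (depth == 0) {
            return 1;
        }
        return Math.multiplyExact((long) moves, sequences(moves - 1, depth - 1));
    }

    public static void main(String[] args) {
        System.out.println(sequences(12, 10));                     // 61917364224
        System.out.println(sequencesWithoutImmediateUndo(12, 10)); // 28295372292
    }
}
```

Even this mild rule roughly halves the work, which is why loop and back-and-forth checks matter at depth 10.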

The problem space


The Algorithm


Represent cube as array 
cube = (0..41).toList() 


Define “moves”, transforms which turn the cube and map pieces to different places 
moves = ['front_anticlock', 'left_col_rot', 'back_anticlock', 'front_180', .. etc.


Implement moves as a series of swaps
'front_clock': [[0, 6], [1, 3], [2, 0], [3, 7], [4, 4], [5, 1], [6, 8], [7, 5] etc.
'front_anticlock': [[0, 2], [1, 5], [2, 8], [3, 1], [5, 7] etc.


Define Source and Target positions 

— Start 

— Test for target position 

— Apply all moves methodically 

— Check for loops and silly (back and forth) moves 
— Recurse to maximum specified depth. 
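
To make the "moves as swaps" idea concrete, here is a minimal Java sketch for a single 3x3 face rather than the full 42-element cube. Each pair is read as [destination, source]. The first eight pairs are the ones shown on the slide for a clockwise front turn; the ninth pair, [8, 2], is inferred from the geometry of a quarter turn and is an assumption, as are the class and method names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FaceRotationSketch {
    // [destination, source] pairs for a clockwise quarter turn of one face laid out as
    //   0 1 2
    //   3 4 5
    //   6 7 8
    static final int[][] FRONT_CLOCK = {
        {0, 6}, {1, 3}, {2, 0}, {3, 7}, {4, 4}, {5, 1}, {6, 8}, {7, 5}, {8, 2}
    };

    // Returns a new position; every destination slot is filled from the old position.
    static List<Integer> applyMove(int[][] move, List<Integer> current) {
        List<Integer> next = new ArrayList<>(current);
        for (int[] pair : move) {
            next.set(pair[0], current.get(pair[1]));
        }
        return next;
    }

    public static void main(String[] args) {
        List<Integer> face = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println(applyMove(FRONT_CLOCK, face)); // [6, 3, 0, 7, 4, 1, 8, 5, 2]
    }
}
```

Reading from the old position and writing into a copy is what keeps the pairs independent of each other; swapping in place would corrupt later pairs.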

The Algorithm in pseudo code


def findPaths(depth, source, target, move) {
    if (depth > MAX_DEPTH or isSilly(move)) {
        return false
    }
    def res = applyMove(move_dict[move], source)
    if (compare(res, target)) {
        solutions.add(solution)
        return true
    }
    for (each_move in move_dict) {
        findPaths(depth + 1, res, target, each_move)
    }
}
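
A runnable version of the same idea, on a toy puzzle of four positions instead of the cube, might look like the sketch below. The three moves, the inverse table used for the "silly move" check, and all names are invented for illustration. The structure also differs slightly from the pseudo code: the target test happens on entry and the path so far is passed along explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PathSearchSketch {
    static final int MAX_DEPTH = 3;
    // Toy moves as [destination, source] pairs, plus the move that undoes each one.
    static final Map<String, int[][]> MOVES = new LinkedHashMap<>();
    static final Map<String, String> INVERSE = new LinkedHashMap<>();
    static {
        MOVES.put("rot_left", new int[][] {{0, 1}, {1, 2}, {2, 3}, {3, 0}});
        MOVES.put("rot_right", new int[][] {{0, 3}, {1, 0}, {2, 1}, {3, 2}});
        MOVES.put("swap_01", new int[][] {{0, 1}, {1, 0}});
        INVERSE.put("rot_left", "rot_right");
        INVERSE.put("rot_right", "rot_left");
        INVERSE.put("swap_01", "swap_01");
    }

    static int[] applyMove(int[][] move, int[] current) {
        int[] next = current.clone();
        for (int[] pair : move) {
            next[pair[0]] = current[pair[1]];
        }
        return next;
    }

    // A move is "silly" if it directly undoes the previous move on the path.
    static boolean isSilly(List<String> path, String move) {
        return !path.isEmpty() && INVERSE.get(move).equals(path.get(path.size() - 1));
    }

    static void findPaths(int[] source, int[] target, List<String> path, List<List<String>> solutions) {
        if (Arrays.equals(source, target)) {
            solutions.add(new ArrayList<>(path)); // record a copy of the winning path
            return;
        }
        if (path.size() >= MAX_DEPTH) {
            return; // depth limit reached
        }
        for (Map.Entry<String, int[][]> move : MOVES.entrySet()) {
            if (isSilly(path, move.getKey())) {
                continue;
            }
            path.add(move.getKey());
            findPaths(applyMove(move.getValue(), source), target, path, solutions);
            path.remove(path.size() - 1); // backtrack
        }
    }

    static List<List<String>> solve(int[] source, int[] target) {
        List<List<String>> solutions = new ArrayList<>();
        findPaths(source, target, new ArrayList<>(), solutions);
        return solutions;
    }

    public static void main(String[] args) {
        // Swap the middle two pieces without disturbing the ends.
        System.out.println(solve(new int[] {0, 1, 2, 3}, new int[] {0, 2, 1, 3}));
        // [[rot_left, swap_01, rot_right]]
    }
}
```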

The plan.. vs. the approach


I wanted the clean, dynamic, easy to write, prototyping approach of Groovy running in an almost continuous deployment dev environment, with an easy workflow to high-performance code in QA/Live


1. Creating a thick Ignite client in Groovy
- Easy.
2. Passing Groovy scripts, or just a Groovy closure snippet, to an Ignite compute task for distribution to the cluster
- I couldn't do that without precompiling the Groovy with groovyc
- I didn't want to do that
3. Finally settled on an Ignite Java "harness" task which took a Groovy script as a parameter and parsed and ran it on each node
- Gave me dynamic script writing/loading/prototyping
- But it adds overhead to each task, which could be improved


© 2023 GridGain Systems


Testing - The fantasy version 


* 100 nodes running on ThinkSystem SR665 V3 Rack Mounted Servers 
* Budget - US$600k
* Not approved by Finance 

Testing - The real reality.


Cobbled together some machines I had.


An old Lenovo Laptop, Intel i5, Dual Core, 1.6 GHz, 8GB RAM, running Ubuntu Linux


A "Beelink" Celeron J3455 Quad Core 2.3 GHz, mini PC, 4GB RAM, running Windows


A Raspberry Pi 4 Quad Core Arm V8 1.5 GHz, 4GB RAM, running Raspbian


My old MacBook Air, Quad Core 1.6 GHz, 16GB RAM, macOS
My work MacBook Pro M1, 8 Core, 16GB RAM, macOS


Budget: $0


Finance approved!

Starting it all up...


Very simple:


Install the same version of Java on all hardware
Install the same version of Ignite on all hardware
Specify Ignite config. on all nodes (e.g. load balancing etc.)
Copy Groovy.jar to classpath of all nodes
Start them all up from SSH terminals
* They connect to each other automatically


* compute.applyAsync(new IgniteClosure(){..}, parameters) sends off the tasks to all nodes, using your specified distribution strategy


* Aggregate answers and that's it! 
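
The "send off one closure with many parameters, then aggregate" shape can be tried on a single machine before involving a cluster. The sketch below is a local stand-in that uses a plain java.util.concurrent thread pool in place of Ignite, so it shows only the fan-out and gather pattern, not Ignite's API, distribution strategy or peer class loading. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ScatterGatherSketch {
    // Runs `job` once per parameter on a local pool and returns results in parameter order.
    static <T, R> List<R> applyAll(Function<T, R> job, List<T> parameters, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<R>> pending = new ArrayList<>();
            for (T parameter : parameters) {
                Callable<R> task = () -> job.apply(parameter);
                pending.add(pool.submit(task)); // fan out
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : pending) {
                results.add(future.get()); // gather, blocking until each task is done
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("a task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> squares = applyAll(n -> n * n, Arrays.asList(1, 2, 3), 2);
        System.out.println(squares); // [1, 4, 9]
    }
}
```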

    Ignite ignite;

    public Object apply(RemotableGroovyScript remoteableScript) {
        System.out.println("Starting Groovy Script in Ignite Task:");
        Object res;
        try {
            // Give the script access to the variables shipped with it
            GroovyShell shell = new GroovyShell(remoteableScript.getBindings());

            // Parse the script text on this node, then run it
            String scriptText = remoteableScript.getScript();
            Script script = shell.parse(scriptText);
            res = script.run();
            System.out.println("Script Done." + res);
        } catch (Exception e) {
            e.printStackTrace();
            return e.getMessage();
        }
        return res;
    }

The RemotableGroovyScript


import java.util.Map;
import groovy.lang.Binding;


public class RemotableGroovyScript {
    private String script;
    private Binding bindings;

    public RemotableGroovyScript(String script, Map<String, Object> bindings) {
        this.script = script;
        this.bindings = new Binding(bindings);
    }

    public String getScript() {
        return this.script;
    }

    public Binding getBindings() {
        return this.bindings;
    }

    public void setVariable(String variable, Object val) {
        bindings.setVariable(variable, val);
    }
}

move_map = [
    'front_clock':     [[0, 6], [1, 3], [2, 0], [3, 7], [4, 4], [5, 1], [6, 8], [7, 5], ...],
    'front_anticlock': [[0, 2], [1, 5], [2, 8], [3, 1], [5, 7], ...],
    ...
    'back_180':        [[21, 29], [22, 28], [23, 27], [24, 26], [26, 24], [27, 23], [28, 22], [29, 21], [30, 36], [31, 37],
                        [32, 38], [33, 39], [34, 40], [35, 41], [36, 30], [37, 31], [38, 32], [39, 33], [40, 34], [41, 35]],
    ... etc.
]


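def applyMove(move, currentPos) {
    // Copy first so the caller's position is left untouched
    def newPos = currentPos.clone()
    // Each swap is a [destination, source] pair read from the old position
    move_map[move].each { swap ->
        newPos[swap[0]] = currentPos[swap[1]]
    }
    return newPos
}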

Not quite Amdahl's law..


Dividing up tasks between very different resources needs special consideration, as results may not be what was expected.


Consider:


- A forest has 100 trees that need felling (invasive species)
- Mama bear can push down 50 in 1 hour
- Papa bear can push down 50 in 1 hour
- Baby bear can push down 20 in 1 hour
- Goldilocks can push down 1 in 1 hour

Almost counter intuitive..


* Mama bear working alone can push down all 100 trees in 2 hours.


* Divide up all trees between the parent bears, and the whole task will take 1 hour (2 x as fast as Mama Bear)


* If you divide the task up between the three bears equally (33 each), everyone has to wait for baby bear to finish 33 trees at 20 per hour, and it takes about 1.65 hours (only about 1.2 x as fast as Mama Bear)


* If you divide up the tasks between Goldilocks and the three bears (25 each), everyone has to wait for Goldilocks to push down 25 trees and the task will take 25 hours. (12.5 times as slow as Mama Bear)


* Adding slower resources naively can give a slower result than having a single, fast resource (and that doesn't even include overhead)
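
The arithmetic behind the bears can be checked with a few lines of Java. This sketch assumes an even split with fractional shares (so three workers get 33.3 trees each rather than 33) and ignores all overhead. The class and method names are hypothetical.

```java
class EqualSplitSketch {
    // Hours until the slowest worker finishes when the trees are split evenly.
    static double hoursWithEqualSplit(double trees, double... treesPerHour) {
        double share = trees / treesPerHour.length;
        double slowest = 0.0;
        for (double rate : treesPerHour) {
            slowest = Math.max(slowest, share / rate);
        }
        return slowest;
    }

    public static void main(String[] args) {
        System.out.println(hoursWithEqualSplit(100.0, 50.0));                  // Mama alone
        System.out.println(hoursWithEqualSplit(100.0, 50.0, 50.0));            // both parents
        System.out.println(hoursWithEqualSplit(100.0, 50.0, 50.0, 20.0));      // three bears
        System.out.println(hoursWithEqualSplit(100.0, 50.0, 50.0, 20.0, 1.0)); // plus Goldilocks
    }
}
```

The four calls give 2 hours, 1 hour, roughly 1.67 hours and 25 hours: with an even split, the total time is set entirely by the slowest worker.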

Aha.. Ignite does Load balancing


* Round Robin
- Distributed evenly across nodes
- Default
* Weighted Random
- Distributed randomly but nodes can be assigned a weight
- Nodes with a bigger weight get proportionally more tasks assigned
* Job Stealing
- Distributed evenly across nodes
- Nodes which are freer can "steal" jobs from nodes which have full queues*

* Mama bear saying to Goldilocks "Let me help you with those extra trees"


With Job Stealing and Weighted Random, time taken should tend towards 100/(50+50+20+1) hours, about 0.83 hours, but overhead and different task sizes (some of the moves are loops, and optimized for redundancy) make this theoretical.


Job Stealing has a little more overhead, but Weighted Random needs pre-knowledge about resources
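
To see why balancing by capacity approaches the combined rate of all workers, the sketch below compares two idealized figures for the forest example: the theoretical bound (total trees divided by the summed rates) and a simple simulation that hands out whole trees one at a time to whichever worker would finish that tree soonest. This is not Ignite's load balancing or job stealing implementation, only a model of the effect with no overhead, and all names are hypothetical.

```java
class LoadBalanceSketch {
    // Lower bound: every worker busy for the whole time, fractional trees allowed.
    static double idealHours(int trees, double[] treesPerHour) {
        double combined = 0.0;
        for (double rate : treesPerHour) {
            combined += rate;
        }
        return trees / combined;
    }

    // Whole trees, each given to the worker that would complete it earliest.
    static double greedyHours(int trees, double[] treesPerHour) {
        int[] assigned = new int[treesPerHour.length];
        for (int t = 0; t < trees; t++) {
            int best = 0;
            for (int w = 1; w < treesPerHour.length; w++) {
                double finish = (assigned[w] + 1) / treesPerHour[w];
                double bestFinish = (assigned[best] + 1) / treesPerHour[best];
                if (finish < bestFinish) {
                    best = w;
                }
            }
            assigned[best]++;
        }
        double makespan = 0.0;
        for (int w = 0; w < treesPerHour.length; w++) {
            makespan = Math.max(makespan, assigned[w] / treesPerHour[w]);
        }
        return makespan;
    }

    public static void main(String[] args) {
        double[] rates = {50.0, 50.0, 20.0, 1.0}; // Mama, Papa, Baby, Goldilocks
        System.out.println(idealHours(100, rates));  // about 0.826
        System.out.println(greedyHours(100, rates)); // about 0.84
    }
}
```

In the simulation Goldilocks never receives a tree, because any bear can finish one sooner than she can, which is the useful outcome an even split cannot produce.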

Some Results


Cluster                | Nodes | Average Time
MacBook Pro only       | 1     | 70 secs
MacBook Air only       | 1     | 155 secs
Lenovo with Linux only | 1     | 238 secs
... only               | ...   | ...
...                    | ...   | ...
Two Macs only          | 2     | 55 secs*
Non-Macs only          | 3     | 171 secs
All - Random Weighted  | 5     | 60 secs*


* surprising!!

Conclusions and work for the future


A possibly unrealistic use case, but very interesting all the same. 
This is a great environment for testing and playing with Ignite clusters and Groovy. 


Peer class loading working with the Groovy classloader would really mean that we could 
just pass Groovy closures as tasks 


Failing that, some libraries for easy passing of Groovy tasks (as I have done here) would be a worthwhile investment


Compute tasks are very useful even without data, but there is a world of data querying to 
be explored. 


Ignite's high performance queries handle the bottleneck of data querying performance, so 
a scripting language is quite feasible in a high performance environment 

Reach Out


Check out these useful resources:


* Free Apache Ignite Training
* Free Apache Ignite Online Learning at GridGain University
* Ignite Summit - Virtual Community Conference
* Product Demo: The GridGain Unified Real-Time Data Platform
* Apache Ignite Community
* Source code? e-mail me!


E-mail:
jeremy.meyer@gridgain.com